Knowledge Reduction and Discovery

Author

  • Yuguo He
Abstract

Knowledge reduction, which includes attribute reduction and value reduction, is an important topic in the rough set literature. It is also closely related to other fields, such as machine learning and data mining. In this paper, an algorithm called TWI-SQUEEZE is proposed. It is so named because it can find a reduct, that is, an irreducible attribute subset that maintains certainty of classification, after two scans, somewhat like squeezing water from a sponge by pressing on both of its sides. Its soundness and computational complexity are given; according to the author, they show it to be the fastest algorithm at the time of writing. A measure of the difference contained in a system, called variety, is put forward. Its quantitative character is measured by demarcation information, of which algorithm TWI-SQUEEZE can be regarded as an application. The demarcation information measure can also be used as heuristic information to guide the algorithm toward a suboptimal reduct. In this paper, this measure is compared with Shannon entropy in detail. Some basic concepts, such as uncertainty, distinctiveness, variety, similarity, and difference, and their relationships are studied. The author also argues for the validity of this measure as a measure of information, which can make it a unified measure for “differentiation”, a concept that appears in the cognitive psychology literature. Value reduction is another important aspect of knowledge reduction. Interestingly, the same algorithm can be used to perform a complete value reduction efficiently. A complete knowledge reduction, which results in an irreducible table, can therefore be accomplished after four scans of the table. The byproducts of reduction are two classifiers of different styles.
Traditionally, attribute reduction is regarded as a data preprocessing phase in knowledge discovery. The author shows in this paper that knowledge reduction, especially approximate knowledge reduction based on attribute reduction, is itself a process of data mining. Various cases and models are discussed to demonstrate the efficiency and effectiveness of the algorithm. Some further topics, such as how to integrate user preference to find a locally optimal attribute subset, are also discussed.
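To make the notion of a reduct concrete, the following is a minimal Python sketch of attribute reduction on a small decision table. It is not TWI-SQUEEZE, whose steps are not given in the abstract; it is a generic backward-elimination pass that assumes a consistent decision table (rows that agree on all condition attributes also agree on the decision). The names `is_consistent`, `backward_reduct`, and `TABLE` are hypothetical and introduced only for illustration.

```python
def is_consistent(rows, attrs, decision):
    """Return True if rows that agree on every attribute in `attrs`
    also agree on the `decision` attribute."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        # Remember the first decision seen for this key; any later
        # disagreement means classification is no longer certain.
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False
    return True


def backward_reduct(rows, attrs, decision):
    """Generic backward elimination (an assumption, not the paper's method):
    try to drop each attribute once and keep the drop whenever the table
    stays consistent. Because consistency is monotone in the attribute set,
    the attributes that remain form an irreducible subset."""
    if not is_consistent(rows, attrs, decision):
        raise ValueError("table is inconsistent on the full attribute set")
    kept = list(attrs)
    for a in list(attrs):
        trial = [x for x in kept if x != a]
        if is_consistent(rows, trial, decision):
            kept = trial
    return kept


# Hypothetical toy table: decision "d" depends only on "b"; "c" duplicates "a".
TABLE = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 0},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]

if __name__ == "__main__":
    print(backward_reduct(TABLE, ["a", "b", "c"], "d"))
```

On this toy table only attribute `b` is needed to determine `d`, so the sketch keeps `b` alone. Each consistency check here scans the whole table, so the sketch performs one scan per attribute; the abstract's claim of finding a reduct in two scans refers to the author's algorithm, not to this illustration.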


Similar resources

Designing an Ontology for Knowledge Discovery in Iran’s Vaccine

Ontology is a requirement engineering product and the key to knowledge discovery. It includes the terminology to describe a set of facts, assumptions, and relations with which the detailed meanings of vocabularies among communities can be determined. This is a qualitative content analysis research. This study has made use of ontology for the first time to discover the knowledge of vaccine in Ir...


Drug Discovery Acceleration Using Digital Microfluidic Biochip Architecture and Computer-aided-design Flow

A Digital Microfluidic Biochip (DMFB) offers a promising platform for medical diagnostics, DNA sequencing, Polymerase Chain Reaction (PCR), and drug discovery and development. Conventional drug discovery procedures require time-consuming and costly manual experiments with a high degree of human error and no guarantee of success. On the other hand, DMFB can be a great solution for miniaturization, int...


Cluster Based Cross Layer Intelligent Service Discovery for Mobile Ad-Hoc Networks

The ability to discover services in a Mobile Ad hoc Network (MANET) is a major prerequisite. Cluster based cross layer intelligent service discovery for MANET (CBISD) combines a cluster based architecture, caching of semantic details of services, and intelligent forwarding using network layer mechanisms. The cluster based architecture using semantic knowledge provides scalability and accuracy. Also, the mini...


The Relationship between Knowledge Management and the Process of Entrepreneurship in Sport Organizations

In the current competitive world, organizations can gain a competitive advantage that supports entrepreneurship by providing the required tools. One of the most important tools for developing entrepreneurship, which was neglected in previous studies, is organizational knowledge management. The present paper aims to shed light on the role and importance of knowledge management in sport entreprene...


Knowledge discovery from patients’ behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services

The rapid growth of information technology (IT) motivates and creates competitive advantages in the health care industry. Nowadays, many hospitals try to build successful customer relationship management (CRM) to recognize target and potential patients, increase patient loyalty and satisfaction, and ultimately maximize their profitability. Many hospitals have large data warehouses containing customer ...





Publication date: 2004